OBJECTIVE


To  assess  the  influence  of  head  movement  cues  on  perceptual  performance  using 
video-displayed  systems  under  controlled  laboratory  testing  conditions. 


RESULTS 

Comparison  of  direct-view  and  TV-displayed  distance  judgment  estimates  indicates 
that  stereoacuity  is  neither  enhanced  nor  degraded  by  the  pseudo-movement  parallax  cues 
associated  with  stereo  TV  when  measured  with  a  Howard-Dolman  task. 


RECOMMENDATIONS 

Future work is needed to assess the contribution that movement parallax makes
to operator performance when the TV display and camera system provides translational
(parallax) cues.
INTRODUCTION


The prime source of perceptual information available to operators of remote undersea
vehicles is a viewing system composed of a single camera operated within commercial
broadcast bandwidth specifications. Despite the advanced technology available in today’s TV
viewing systems, they continue to impose severe restrictions on the type of underwater task
that can be undertaken, the speed and accuracy of the operator’s performance and the visibility
conditions under which the vehicle can operate effectively. These restrictions result from the
fact that the televised image does not faithfully reproduce the perceptual information
available to a natural observer viewing the actual scene. Loss of perceptual information results
from reduced resolution and contrast, inadequate depth of field and field of view, and the
absence of the size, depth and form information that is derived from the actions of the natural
observer (head and eye movements and changes in accommodation and convergence) as the
operator actively inspects a visual scene.

Recent research efforts have been directed toward improving viewing systems in
order to enhance perceptual information available to the operator (see reviews of Smith,
Pepper, Cole and Merritt, 1979, and Cole, 1980). One of the most promising developments
has been TV systems which present image pairs to the observer in a binocular disparity
configuration which produces stereopsis. Smith, Pepper, Cole and Merritt (1979) present
evidence to indicate significant improvements in task performance utilizing stereo TV, the degree
of improvement being dependent on visibility conditions and the amount of manipulator
positioning required in the depth plane. While stereo TV shows great promise as a means of
improving remote operator performance, there are a number of related factors which
undermine the full utilization of binocular disparity cues produced with video systems.

CONVERGENCE  AND  ACCOMMODATION  CUES 

Binocular disparity produced by stereo TV conveys a sense of depth as well as
objective data which can be employed by the subject to determine distance and depth
relationships. However, only a single convergence angle is employed in most stereo systems. That
is, the two cameras are converged (and their images completely overlap) at a plane which
defines the forward extent of the working volume of the visual scene depicted. Object
images beyond this plane are increasingly offset on the monitor face in direct proportion to
their distance from the convergence plane, and those objects are seen at increasing distances
from the monitor face. While the subject’s convergence varies (when viewing the TV display)
with increasing image offset, accommodation, which is fixed at the monitor screen, does not
vary. Previous research indicates that when a mismatch occurs between convergence and
accommodation, the perceived absolute distance will be a result of a compromise between the
two cues (Ono, Mitson and Seabrook, 1971). However, recent research (Ritter, 1977)
indicates that accommodation plays a relatively minor role in the perceptual processing of
information; the importance of dynamic convergence (given by the spatial separation of the
images presented on the face of the monitor) has yet to be determined. For these and other
reasons to be mentioned later, the perceived depth in a stereo TV scene may be less than
under direct viewing conditions, and conflicting cues may result in illusions, including false
perceptions in depth.

MOTION  CUES 

True motion parallax cues are not reproduced by stereo TV systems. Thus, depth
information, given by the combination of head and eye movements and the resulting
movement of the object-images on the retina, does not exist. Under normal direct viewing
conditions, a stationary object viewed while moving the eye or head appears stationary
despite transformations (change in shape and position) of the object’s image on the retina.
It has been proposed that the observer compensates for his eye/head movement in
evaluating retinal image changes (Rock, 1966). Compensation theory states that a central mechanism
exists which compares head movements with retinal image movements or eye movements
required to maintain object fixation. In a study to evaluate this position, Gogel and Tietz
(1974) concluded that the observer’s perception of the distance of the object was a major
factor in determining whether the object appeared to move in a direction with or against
head movement, or was perceived to be stationary.

In  our  stereo  display,  apparent  motion  of  objects  as  a  result  of  lateral  head  movement 
is  not  in  agreement  with  the  motions  which  are  perceived  under  direct  view  conditions.  The 
nature  and  reasons  for  these  differences  are  revealed  by  a  comparison  of  the  physical 
features  of  the  visual  scenes  and  their  resulting  retinal  images  for  direct  and  stereo  views. 

Under  direct  views,  all  objects  located  on  the  plane  of  the  horopter  (determined  by  the 
fixation  point)  project  images  to  corresponding  areas  of  the  two  retinae,  while  objects  at 
other  locations  project  disparate  images;  crossed  images  are  projected  for  objects  located 
nearer  than  the  horopter  and  uncrossed  images  for  objects  located  beyond.  When  the  head 
is  moved  laterally,  objects  on  the  horopter  appear  to  remain  stationary,  while  near  objects 
appear  to  move  against  the  direction  of  head  movement,  and  objects  beyond  appear  to  move 
with  the  head.  The  amount  of  apparent  movement  of  an  object  is  directly  proportional  to 
its  distance  from  the  horopter.  A  further  consequence  of  these  events  is  that  the  relative 
position  of  objects  at  different  locations  in  the  depth  plane  changes  with  movement  of  the 
head.  For  the  TV  view,  the  convergence  plane  of  the  two  cameras  produces  completely 
overlapping  images,  and  the  effect  is  similar  to  the  horopter  in  the  direct  view:  i.e.,  objects 
at  the  plane  of  the  camera’s  convergence  project  corresponding  images,  and  near  and  far 
objects  project  disparate  crossed  and  uncrossed  images  respectively.  Here  the  similarity  ends, 
however,  because  the  images  of  all  objects  are  painted  on  a  single  plane  in  space;  i.e.,  the  TV 
screen.  With  head  movement,  objects  at  the  convergence  plane  appear  to  remain  stationary, 
and  near  and  far  objects  appear  to  move  in  proportion  to  their  distance  from  that  plane  - 
but  in  a  direction  opposite  to  those  observed  in  the  direct  view.  A  further  sense  of  illusion 
is  produced  by  the  fact  that  while  objects  appear  to  move,  their  relative  positions  remain 
unchanged,  a  consequence  of  the  single  plane  locus  of  all  images  on  the  TV  screen. 
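
The direct-view geometry described above can be illustrated with a short sketch. It is not part of the original study: the function name relative_parallax_deg and the example distances are hypothetical, the object and the fixation point are both assumed to lie straight ahead of the observer before the head moves, and positive angles denote displacement in the direction of head movement.

```python
import math

def relative_parallax_deg(object_cm, fixation_cm, head_shift_cm):
    """Angular shift (degrees) of an object relative to the fixation point
    after a lateral head translation. Positive = same direction as the head."""
    # After the head moves by +head_shift_cm, a point that was straight ahead
    # at distance d lies at angle atan(-shift / d) from straight ahead.
    object_angle = math.atan2(-head_shift_cm, object_cm)
    fixation_angle = math.atan2(-head_shift_cm, fixation_cm)
    return math.degrees(object_angle - fixation_angle)

# Illustrative values only: fixation at 305 cm, 10 cm lateral head shift.
near = relative_parallax_deg(250.0, 305.0, 10.0)        # object nearer than fixation
far = relative_parallax_deg(360.0, 305.0, 10.0)         # object beyond fixation
same_plane = relative_parallax_deg(305.0, 305.0, 10.0)  # object on the fixation plane
```

Under these assumptions, near is negative (movement against the head), far is positive (movement with the head), and same_plane is zero. The last case corresponds to the TV display, where every image lies on the screen surface and therefore produces no true relative parallax.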

Obviously, the normal stabilization of the visual scene that man achieves despite
head/eye movements has evolved out of necessity; chaos would result from having
object-images that appear to move with every movement of the head and eyes. The fact that the
objects move differentially under the stereo televised scene appears to be further support
for the existence of a compensation mechanism. Gogel’s (1977) interpretation of a variety
of commonly observed visual movement distortions specifies that the amount of movement
distortion perceived by the observer is a direct function of the perceived depth of the objects,
as the perceived depth is the input information to the compensation mechanism. Because
a stereo view provides much more perceived depth than a mono view, the input to the
compensation mechanism is much greater. In the absence of actual object-image movement
with head motion, overcompensation results from the expectation of movement, producing
apparent movement of the objects in the opposite direction. Thus, the relationship between
perceived depth and the resulting motion distortion is predicted to be a linear one,
dependent upon depth cues, irrespective of how those cues are produced. If the relationship is
linear, it is possible that these motion cues could be utilized by the observer in making
stereoacuity judgments. The study reported herein was designed to test this proposition.
METHOD


SUBJECTS 

The five subjects in the present study were scientists, engineers and/or technicians
employed at the Naval Ocean Systems Center (NOSC), Kailua, Hawaii. Four were male, one
was female, and the age range was from 19 to 49 years. All subjects had normal or
corrected-to-normal visual acuity as determined by a Snellen chart test and normal binocular vision
according to the Keystone Telebinocular test.

APPARATUS 

Figure 1 provides a graphic description of the video system. The system consisted
of two model CC 002 RCA color video cameras fitted with Canon 1 ;20 TV zoom lenses,
V6 X 17, with a range of 17 - 102 mm. Each camera was mounted on an Oriel adjustable
slide mount in a 90° configuration. A 50/50 half-silvered beam splitter was positioned at a
45° angle between the cameras to bring their lines of regard into parallel. The camera aimed
through the beam splitter could be moved laterally to produce the four camera separations
(disparity views) employed in this study (0.0 cm, 3.125 cm, 6.25 cm and 12.5 cm). The
signals from the cameras were transmitted by means of cables to two Model OOA 1 7/N
Conrac monitors positioned at 90° relative to one another. The monitor frames and screens
were completely covered by polarized filters oriented to produce reversed direction of
polarity for the two monitors. A beam splitting mirror located between the monitors at
45° was aligned so that the frames and screens of the two monitors were superimposed. The
observer was seated facing the right monitor and at 90° to the left monitor. Eyeglasses
containing polarized filters matched the polarity of the left and right eye filters to that of the
left and right monitors, thus ensuring that the left eye saw only the left monitor and the
right eye saw only the right monitor. With the image of the left monitor reversed to
compensate for the reversal caused by the mirror’s reflection, the observer experienced a
stereoscopic view. The degree of stereopsis was determined by the separation of the two cameras.
The right monitor was modified so that the adjustment controls for contrast and
brightness were located on the right exterior panel, permitting the images on the two monitors
to be matched.

The cameras were directed at a modified Howard-Dolman box located at a distance
of 235 cm. The box has a window of 37.5 X 25.0 cm and a depth of 96.0 cm. The diameter
of the left rod is 2.495 cm; the diameter of the right rod is 1.922 cm. The subject was
informed that the rods were unequal in diameter. Thus, he/she was to avoid size matching
and base all judgments of depth on the perception of three-dimensional space. During the
experiment, the left rod was kept stationary in the middle of the box and the right rod was
moved by the experimenter.

PROCEDURE 

Each of the five subjects was given three or four one-hour training sessions. Each
session consisted of six conditions: two conditions where the observer viewed the rods
directly (Direct View) with either one (Monocular) or both (Binocular) eyes and four
conditions which utilized the TV viewing system (TV View), which included camera separation
conditions of 0.0 cm, 3.125 cm, 6.25 cm and 12.5 cm. The viewing distance under the
Direct View condition was set at 305 cm in order that the rods would subtend the same
visual angle at the eye of the observer as under the TV View condition.
Twelve trials were administered under each of the viewing conditions. On half the
trials, the subject was instructed to keep his/her head still, and on the remaining six trials
was instructed to move his/her head from side to side ±() inches. This condition was
alternated every three trials. On alternating trials, the experimenter started moving the right rod
either from the front or from the back of the box. The subject’s task was to say “stop” when
the two rods were perceived to be at the same depth. The subject could then make a final
adjustment by saying “forward” or “back.” Upon hearing the subject say “forward” or
“back,” the experimenter would say “go” and begin to move the right rod in the appropriate
direction. After one or two such final adjustments, the subject would say “good” to
indicate that he/she was satisfied to leave the rod there. The experimenter would record the
distance from the equal point in centimeters with a “+” for forward and a “−” for behind.
The experimenter would then say “ready” and “go” to begin the next trial. Between each
set of twelve trials, the experimenter changed cameras or instructed the subject to come from
the TV View station to the Direct View station. During the Direct View trials, the
Howard-Dolman box was turned from the cameras toward the viewing chair. The subject was then
asked to equalize the depth of the rods with one eye covered or with both eyes.

On all trials, the experimenter looked away so as not to inadvertently cue the
subject. A curtain separated the subject and the experimenter. The experimenter varied the
velocity of rod movement from one trial to the next in order to minimize the opportunity
for the subject to use time as a cue.

Each subject was presented with four experimental sessions producing 4 X 6 X 12
data points. For purposes of analysis, the mean of the six trials for each of the two head
positions was calculated. Each subject contributed 4 X 6 X 2 data points in the analysis.
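
The data reduction described above can be sketched as follows. This is an illustration only: the function names, the nested-list layout, and the placeholder error values are hypothetical, and the sketch assumes that each block of twelve trials begins with three head-still trials (the report says only that the head condition alternated every three trials). The report also does not state whether signed or absolute errors were averaged; the sketch simply averages whatever values it is given.

```python
from statistics import mean

N_SESSIONS, N_CONDITIONS, N_TRIALS = 4, 6, 12

def head_state(trial_index):
    """Head instruction for a trial; alternates every three trials.
    Starting with 'still' is an assumption, not stated in the report."""
    return "still" if (trial_index // 3) % 2 == 0 else "moving"

def reduce_trials(errors):
    """errors[session][condition][trial] -> means[session][condition][head_state]."""
    reduced = []
    for session in errors:
        row = []
        for trials in session:
            groups = {"still": [], "moving": []}
            for t, error_cm in enumerate(trials):
                groups[head_state(t)].append(error_cm)
            row.append({state: mean(values) for state, values in groups.items()})
        reduced.append(row)
    return reduced

# Placeholder data (trial index used as the "error"), NOT values from the study.
errors = [[[float(t) for t in range(N_TRIALS)]
           for _ in range(N_CONDITIONS)]
          for _ in range(N_SESSIONS)]
means = reduce_trials(errors)
n_points = sum(len(cell) for row in means for cell in row)
```

With one subject's 4 X 6 X 12 raw scores as input, n_points comes out as 4 X 6 X 2 = 48, matching the count given in the text.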


RESULTS  AND  DISCUSSION 

The results of a within-groups ANOVA are presented in table 1. Mean error scores
were calculated across subjects and replications and are plotted in figure 2 for the two
Direct View and four TV View conditions with head either stationary or moving. These
results do not appear to support the efficacy of head-movement-produced cues under stereo
TV View conditions since there is no consistent improvement in performance under the
three stereo TV viewing conditions when the head is moved as compared to when it is held
stationary (F = 1.16, p < 0.34, df = 1, 4 for the main effect of head movement). The
Direct View results, on the other hand, do show the expected improvement in stereoacuity
performance when movement-produced cues are available (F = 3.90, p < .0125, df = 5, 20
for the head movement X viewing condition interaction). Under monocular Direct View, mean
error scores are reduced by greater than half when the head is moved. Under binocular
Direct View, the improvement in performance is much smaller, a result, in all likelihood, of
the predominance of binocular disparity cues in bringing the threshold near the actual point
of subjective equality.

The main effect of viewing condition also was significant (F = 10.16, p < 0.001,
df = 5, 20). A comparison (figure 2) of stereoacuity under mono TV (0.0-cm camera
separation) with stereo TV (3.125 cm, 6.25 cm and 12.5 cm) replicates our earlier
finding (Pepper, Cole and Smith, 1977) of nearly a twofold increase in stereoacuity with the
use of stereo TV. The largest gain occurs with increasing the camera separation from 0.0 cm
to 3.125 cm, just half the average separation of human eyes. Obviously, this gain is due to
the fact that under mono conditions, size matching is the dominant cue determining
performance. Further increases in separation to 6.25 cm (average human eye separation)
and 12.5 cm (double the average separation) appear to bring successively less improvement in
stereoacuity.

Table 1. Repeated-measures analysis of variance of depth judgment errors.

A final question that can be addressed from these data is: How efficient is this TV
system in sensing, relaying and presenting the information at the test site to the operator?
The ratio of Direct View to TV View performance provides a useful index of efficiency.
Based on the ratio of mean error scores under binocular Direct View and the 6.25-cm
camera separation TV View, the efficiency rating of this system for providing stereoacuity
information is 0.66. Obviously, there is a good deal of room for improvement in this stereo
system. Perhaps it is the inability of the system to reproduce essential visual features which
results in a failure to produce superior performance under the wider camera base conditions
(hyperstereopsis). A similar comparison of monocular Direct View with the 0.0-cm TV View
gives an efficiency value of 0.97, a seemingly good performance of the TV system under
mono conditions. However, as mentioned above, in the absence of stereo depth cues, size
matching is probably the dominant cue in determining performance. This conclusion is
supported by the fact that with the two rods being of different physical size, the observer
can only equate their retinal images by positioning the small rod closer than the larger rod.
That this is the case can be seen from the average error settings of around 8 cm when stereo
information is absent from the TV View and both stereo- and movement-produced cues are
absent from the Direct View (figure 2). This fact suggests that our system is an efficient
carrier of information for matching image size. On the other hand, when we add
movement-produced information to the Direct View condition, two things happen: depth judgment is
vastly improved and the efficiency value drops markedly (from 0.97 to 0.38). This outcome,
of course, reflects the fact that our TV system conveys no movement-produced information
at 0.0-cm camera separation (monocular mode). It shows dramatically, however, the
potential of veridical-movement-produced cues (as opposed to the pseudo-movement-produced
cues of the present study) for improving depth judgment, and argues strongly for future
efforts to develop and test such cues.
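
The efficiency index used above is a simple ratio, sketched below. The function name efficiency and the first pair of input values are hypothetical placeholders; the report gives only the resulting ratios (0.66, 0.97 and 0.38), not the underlying mean errors, so none are reproduced here.

```python
def efficiency(direct_view_error_cm, tv_view_error_cm):
    """Ratio of Direct View to TV View mean error, as described in the text.
    Values below 1.0 mean the TV view produced larger errors than direct view."""
    if tv_view_error_cm <= 0:
        raise ValueError("TV View mean error must be positive")
    return direct_view_error_cm / tv_view_error_cm

# Placeholder inputs, for illustration only (not data from the study).
example = efficiency(2.0, 4.0)

# With the mono TV error held fixed, a drop in efficiency from 0.97 to 0.38
# implies the Direct View error fell to this fraction of its head-still value:
implied_direct_error_fraction = 0.38 / 0.97
```

Because the 0.0-cm TV View error is the denominator in both cases, the drop from 0.97 to 0.38 implies that the monocular Direct View error with head movement is roughly 39 percent of its head-still value, which agrees with the earlier statement that error was reduced by more than half.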


RECOMMENDATIONS 

While the results of the present study fail to demonstrate the utility of distorted
head movement cues in the depth judgment task, the potential for veridical movement cues
to aid performance in more complex depth plane positioning tasks cannot be overlooked.
These movement parallax cues are primarily translational, and require an isomorphic coupling
of the camera system to the head of the operator. The complexity of such head-to-camera
coupling is immense, requiring control of vertical, lateral and longitudinal movements in both
linear and angular dimensions. Potential methods to provide the appropriate interface
between the cameras and the head-coupled display system have been developed by the
aerospace industry to solve problems in aiming weapons in combat aircraft.

Another important area of research which has been revealed in the present study
involves the adequacy of stereo TV systems to provide sufficient information to enable
hyperstereoptic performance. The data presented in figure 2 indicate a marked improvement
in performance from 0.0 to 3.125 cm, but to a much lesser extent when the camera base
distance was increased to 12.5 cm. If the eye acts as an optical system to resolve disparity
(and thus enable judgments of distance or depth), the relationship between the base
separation of the cameras and performance should be direct and linear. Our data do not show this
result clearly. The basis for our findings may well lie in the method of presenting the
stimulus, in the perceptual quality of the stimulus rods themselves, or be due to ceiling effects on
the measurement of performance. A necessary next step to understanding our results is to
evaluate hyperstereopsis under more closely controlled stimulus presentation conditions.
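
The expectation of a direct, linear relationship follows from standard stereo geometry, which the sketch below illustrates. It is not the authors' analysis: the function name, the symmetric-convergence formula, and the 1-cm depth offset are assumptions chosen for illustration, and the calculation covers only the camera side of the system (it ignores lens magnification, monitor size and viewing distance).

```python
import math

def angular_disparity_arcsec(baseline_cm, distance_cm, depth_offset_cm):
    """Difference in convergence angle (arc seconds) between two points at
    distance_cm and distance_cm + depth_offset_cm, for a given camera baseline."""
    def convergence(d):
        # Angle subtended by the baseline at a point straight ahead at distance d.
        return 2.0 * math.atan(baseline_cm / (2.0 * d))
    disparity_rad = convergence(distance_cm) - convergence(distance_cm + depth_offset_cm)
    return math.degrees(disparity_rad) * 3600.0

DISTANCE_CM = 235.0     # camera-to-box distance given in the report
DEPTH_OFFSET_CM = 1.0   # hypothetical rod offset, for illustration only

disparities = {
    baseline: angular_disparity_arcsec(baseline, DISTANCE_CM, DEPTH_OFFSET_CM)
    for baseline in (0.0, 3.125, 6.25, 12.5)  # camera separations used in the study
}
```

Under these assumptions each doubling of the camera separation very nearly doubles the disparity available for a given depth offset, so a purely geometric account would predict steadily improving performance up to 12.5 cm. The data in figure 2 depart from that prediction.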

We  are  presently  modifying  our  laboratory  to  accommodate  such  revised  conditions. 